Modeling Projections in Microaggregation
Abstract
Microaggregation is a method used by statistical agencies to limit the disclosure of sensitive microdata. Optimal microaggregation has been proven to be NP-hard when more than one variable is microaggregated at the same time. To solve this problem heuristically, a few methods based on projections have been introduced in the literature. The main drawback of such methods is that the projected axis is computed by maximizing a statistical property (e.g., the global variance of the data), disregarding the fact that the aim of microaggregation is to keep the disclosure risk as low as possible for all records. In this paper we present some preliminary results on the application of aggregation functions for computing the projected axis. We show that, when projected microaggregation is used, computing the projected axis with the Sugeno integral can in some cases reduce the disclosure risk of the protected data.
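The projection-based heuristic described in the abstract can be illustrated with a minimal Python sketch. It is not the method of the paper: the axis is simply passed in as a parameter (the paper computes it with an aggregation function such as the Sugeno integral, which is not reproduced here), and the function names `project` and `projected_microaggregation`, the fixed-size grouping rule, and the sample data are all assumptions made for illustration.

```python
from statistics import fmean


def project(record, axis):
    """Scalar projection of one record onto the given axis (dot product)."""
    return sum(x * a for x, a in zip(record, axis))


def projected_microaggregation(records, axis, k):
    """Illustrative fixed-size projected microaggregation (assumed variant).

    Records are sorted by their projection on `axis`, cut into consecutive
    groups of k records (the last group absorbs the remainder, so every
    group has between k and 2k-1 records), and each record is replaced by
    the centroid of its group.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    n_attrs = len(records[0])
    # Indices of the records, ordered by their projected value.
    order = sorted(range(len(records)), key=lambda i: project(records[i], axis))
    n_groups = len(records) // k
    protected = [None] * len(records)
    for g in range(n_groups):
        start = g * k
        # The final group takes every remaining record.
        end = len(records) if g == n_groups - 1 else start + k
        members = order[start:end]
        centroid = tuple(
            fmean(records[i][j] for i in members) for j in range(n_attrs)
        )
        for i in members:
            protected[i] = centroid
    return protected


# Hypothetical two-attribute data set and axis, chosen only for the example.
data = [(1.0, 2.0), (2.0, 1.0), (10.0, 11.0), (11.0, 10.0), (12.0, 12.0)]
protected = projected_microaggregation(data, axis=(1.0, 1.0), k=2)
```

With this structure, changing how the axis is chosen (maximum variance, or an aggregation-based criterion as the abstract proposes) only changes the `axis` argument; the grouping and centroid replacement stay the same.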
Related resources
A novel local search method for microaggregation
In this paper, we propose an effective microaggregation algorithm to produce more useful protected data for publishing. Microaggregation is mapped to a clustering problem with known minimum and maximum group size constraints. In this scheme, the goal is to cluster n records into groups of at least k and at most 2k−1 records, such that the sum of the within-group squ...
Repeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...
Improved Univariate Microaggregation for Integer Values
Privacy issues during data publishing are an increasing concern for the entities involved. The problem is addressed in the field of statistical disclosure control with the aim of producing protected datasets that are also useful for interested end users such as government agencies and research communities. The problem of producing useful protected datasets is addressed in multiple computational priva...
Beyond Multivariate Microaggregation for Large Record Anonymization
Microaggregation is one of the most commonly employed microdata protection methods. The basic idea of microaggregation is to anonymize data by aggregating original records into small groups of at least k elements and, therefore, preserving k-anonymity. Usually, in order to avoid information loss, when records are large, i.e., the number of attributes of the data set is large, this data set is s...
A Comparative Study on Microaggregation Techniques for Microdata Protection
Microaggregation is an efficient perturbative Statistical Disclosure Control (SDC) technique for microdata protection. It is a unified approach and naturally satisfies k-anonymity without generalization or suppression of data. Various microaggregation techniques, both fixed-size and data-oriented, for univariate and multivariate data exist in the literature. These methods have been evaluated using t...